Is speech data clustered? - statistical analysis of cepstral features

Authors

Tomi Kinnunen

Ismo Kärkkäinen

Pasi Fränti

Abstract

Speech analysis applications are typically based on short-term spectral analysis of the speech signal. The feature extraction process outputs one feature vector per frame. The features are then processed by application-dependent techniques, such as hidden Markov models or vector quantization. Independently of the application, it is often desirable that the feature vectors form separable clusters in the feature space. In this work, we study whether the data is really clustered in the feature space and, if so, how many clusters typical speech data contains. We consider different forms of the widely used cepstral features.
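To make the "one feature vector per frame" idea concrete, here is a minimal sketch of short-term cepstral feature extraction. It is not the authors' implementation: the frame length, hop size, Hamming window, number of coefficients, and all function names are assumptions chosen for illustration, and a plain real cepstrum computed with a naive DFT stands in for whichever cepstral variants the paper evaluates.

```python
import cmath
import math


def frame_signal(signal, frame_len, hop):
    """Split a signal into overlapping frames of frame_len samples."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def dft(x):
    """Naive O(N^2) discrete Fourier transform (fine for short frames)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def real_cepstrum(frame, n_coeffs):
    """Real cepstrum: inverse DFT of the log magnitude spectrum."""
    n = len(frame)
    # Hamming window to reduce spectral leakage (an assumed choice).
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    # Small constant avoids log(0) for empty spectral bins.
    log_mag = [math.log(abs(c) + 1e-10) for c in dft(windowed)]
    # Keep only the first n_coeffs cepstral coefficients.
    return [sum(log_mag[k] * cmath.exp(2j * math.pi * k * q / n)
                for k in range(n)).real / n
            for q in range(n_coeffs)]


def extract_features(signal, frame_len=64, hop=32, n_coeffs=12):
    """Return one cepstral feature vector per frame."""
    return [real_cepstrum(f, n_coeffs)
            for f in frame_signal(signal, frame_len, hop)]


# Synthetic two-tone signal standing in for real speech samples.
signal = [math.sin(2 * math.pi * 5 * t / 256)
          + 0.5 * math.sin(2 * math.pi * 40 * t / 256)
          for t in range(256)]
features = extract_features(signal)
```

The resulting list of vectors is the kind of point set whose cluster structure the abstract asks about; any clustering analysis would operate on `features` rather than on the raw samples.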

Related resources

Clustered ? - Statistical Analysis of Cepstral Features

Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

Automatic recognition of speech emotional states in noisy conditions has become an important research topic in the emotional speech recognition area, in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...

Improving of Feature Selection in Speech Emotion Recognition Based-on Hybrid Evolutionary Algorithms

One of the important issues in speech emotion recognition is selecting appropriate feature sets in order to improve the detection rate and classification accuracy. In previous studies, researchers tried to select appropriate features for classification by using feature selection and feature space reduction methods, such as Fisher and PCA. In this research, a hybrid evolutionary algorit...

Recognizing the Emotional State Changes in Human Utterance by a Learning Statistical Method based on Gaussian Mixture Model

Speech is one of the most opulent and instant methods to express emotional characteristics of human beings, which conveys the cognitive and semantic concepts among humans. In this study, a statistical-based method for emotional recognition of speech signals is proposed, and a learning approach is introduced, which is based on the statistical model to classify internal feelings of the utterance....

Voice-based Age and Gender Recognition using Training Generative Sparse Model

Gender recognition and age detection are important problems in telephone speech processing, where voice characteristics are used to investigate the identity of an individual. In this paper, a new gender and age recognition system is introduced based on generative incoherent models learned using sparse non-negative matrix factorization and an atom correction post-processing method. Similar to genera...

Publication date: 2001
